skip to main content


Search for: All records

Creators/Authors contains: "Kapritsos, Manos"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Understanding and debugging the performance of distributed systems is a notoriously hard task, but a critical one. Traditional techniques like logging, tracing, and benchmarking represent a best-effort way to find performance bugs, but they either require a full deployment to be effective or can only find bugs after they manifest. Even with such techniques in place, real deployments often exhibit performance bugs that cause unwanted behavior. In this paper, we present Performal, a novel methodology that leverages the recent advances in formal verification to provide rigorous latency guarantees for real, complex distributed systems. The task is not an easy one: it requires carefully decoupling the formal proofs from the execution environment, formally defining latency properties, and proving them on real, distributed implementations. We used Performal to prove rigorous upper bounds for the latency of three applications: a distributed lock, ZooKeeper and a MultiPaxos-based State Machine Replication system. Our experimental evaluation shows that these bounds are a good proxy for the behavior of the deployed system and can be used to identify performance bugs in real-world systems. 
    more » « less
    Free, publicly-accessible full text available June 6, 2024
  2. In this paper, we examine the Paxos protocol and demonstrate how the discrete numbering of ballots can be leveraged to weaken the conditions for learning. Specifically, we define the notion of consecutive ballots and use this to define Consecutive Quorums. Consecutive Quorums weaken the learning criterion such that a learner does not need matching accept messages sent in the same ballot from a majority of acceptors to learn a value. We prove that this modification preserves the original safety and liveness guarantees of Paxos. We define Consecutive Paxos which encapsulates the properties of discrete consecutive ballots. To establish the correctness of these results, in addition to a paper proof, we formally verify the correctness of a State Machine Replication Library built on top of an optimized version of Multi-Paxos modified to reflect Consecutive Paxos. 
    more » « less
  3. This paper presents Aegean, a new approach that allows fault-tolerant replication to be implemented beyond the confines of the client-server model. In today’s computing, where services are rarely standalone, traditional replication protocols such as Primary-Backup, Paxos, and PBFT are not directly applicable, as they were designed for the client-server model. When services interact, these protocols run into a number of problems, affecting both correctness and performance. In this paper, we rethink the design of replication protocols in the presence of interactions between services and introduce new techniques that accommodate such interactions safely and efficiently. Our evaluation shows that a prototype implementation of Aegean not only ensures correctness in the presence of service interactions, but can further improve throughput by an order of magnitude. 
    more » « less
  4. Designing and implementing distributed systems correctly is a very challenging task. Recently, formal verification has been successfully used to prove the correctness of distributed systems. At the heart of formal verification lies a computer-checked proof with an inductive invariant. Finding this inductive invariant, however, is the most difficult part of the proof. Alas, current proof techniques require inductive invariants to be found manually—and painstakingly—by the developer. In this paper, we present a new approach, Incremental Inference of Inductive Invariants (I4), to automatically generate inductive invariants for distributed protocols. The essence of our idea is simple: the inductive invariant of a finite instance of the protocol can be used to infer a general inductive invariant for the infinite distributed protocol. In I4, we create a finite instance of the protocol; use a model checking tool to automatically derive the inductive invariant for this finite instance; and generalize this invariant to an inductive invariant for the infinite protocol. Our experiments show that I4 can prove the correctness of several distributed protocols like Chord, 2PC and Transaction Chains with little to no human effort. 
    more » « less
  5. Distributed systems are notoriously difficult to design and implement correctly. Formal verification provides correctness proofs, and has recently been successfully applied to various distributed systems. At the heart of a typical formal verification is a computer-checked proof with an inductive invariant. Finding this inductive invariant is the hardest part of the proof: a part that is currently undertaken manually by the developer and is responsible for most of the effort associated with formal verification. In this paper, we present a new approach: Incremental Inference of Inductive Invariants (I4), to automatically generate inductive invariants for distributed protocols. We start from a simple idea: the inductive invariant of a finite instance of the protocol must be an instance of a general inductive invariant for the infinite distributed protocol. In I4, we instantiate a finite instance of the protocol, work out the finite inductive invariant of this instance, then figure out the general inductive invariant as a generalization of the finite invariant. Our experiments show that I4 can finish the general proof of correctness of several systems with minimal human effort. 
    more » « less